Guidelines to Enhance 3-D Stencil Codes on the Intel Xeon Phi Coprocessor

Authors

  • Mario HERNÁNDEZ
  • Juan M. CEBRIÁN
  • José M. CECILIA
  • José M. GARCÍA

Abstract

Accelerators like the Intel Xeon Phi aim to fulfill the computational requirements of modern applications, including stencil computations. Stencils are finite-difference algorithms used in many scientific and engineering applications to solve large-scale, high-dimensional partial differential equations. However, programming such massively parallel architectures is still a challenge for inexperienced developers. This paper provides firm foundations to guide developers in maximizing the benefits of hardware-software co-design for 3-D stencil codes running on the Intel Xeon Phi (Knights Corner) architecture. We propose a set of guidelines to optimize stencil codes based on a C/C++ OpenMP implementation. The guidelines are evaluated using three kernels that are widely applied to simulate the heat, acoustic diffusion, and isotropic seismic wave equations. Our experimental results yield performance gains of more than 25x compared to high-level sequential implementations (e.g., Matlab).
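
To make the kind of kernel discussed in the abstract concrete, the following is a minimal sketch of one time step of a 7-point 3-D heat stencil in C with OpenMP. It is not the authors' code and does not reproduce their guidelines; the names heat_step and idx, the coefficient alpha, the row-major layout, and the choice of collapse(2) plus a simd hint on the unit-stride loop are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Flatten (x, y, z) into a 1-D index; x is the unit-stride dimension.
 * Hypothetical helper, assumed row-major layout. */
static size_t idx(int x, int y, int z, int nx, int ny)
{
    return ((size_t)z * (size_t)ny + (size_t)y) * (size_t)nx + (size_t)x;
}

/* One Jacobi-style step of a 7-point heat stencil (illustrative only).
 * Reads from `in`, writes interior cells of `out`; boundary cells of
 * `out` are left untouched. `alpha` is an assumed diffusion factor. */
void heat_step(const float *restrict in, float *restrict out,
               int nx, int ny, int nz, float alpha)
{
    assert(nx >= 3 && ny >= 3 && nz >= 3);

    const size_t sy = (size_t)nx;              /* stride between rows   */
    const size_t sz = (size_t)nx * (size_t)ny; /* stride between planes */

    /* Distribute the two outer loops across threads. */
    #pragma omp parallel for collapse(2) schedule(static)
    for (int z = 1; z < nz - 1; ++z) {
        for (int y = 1; y < ny - 1; ++y) {
            /* Ask the compiler to vectorize the unit-stride loop. */
            #pragma omp simd
            for (int x = 1; x < nx - 1; ++x) {
                const size_t c = idx(x, y, z, nx, ny);
                out[c] = in[c] + alpha * (in[c - 1]  + in[c + 1] +
                                          in[c - sy] + in[c + sy] +
                                          in[c - sz] + in[c + sz] -
                                          6.0f * in[c]);
            }
        }
    }
}
```

A caller would typically allocate two grids of nx * ny * nz floats, call heat_step(in, out, nx, ny, nz, alpha), and swap the two pointers between time steps. Without an OpenMP-enabled compiler flag the pragmas are ignored and the loop nest runs sequentially.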

Similar resources

Toward efficient distribution of MPDATA stencil computation on Intel MIC architecture

The multidimensional positive definite advection transport algorithm (MPDATA) belongs to the group of nonoscillatory forward-in-time algorithms, and performs a sequence of stencil computations. MPDATA is one of the major parts of the dynamic core of the EULAG geophysical model. The Intel Xeon Phi coprocessor is the first product based on the Intel Many Integrated Core (Intel MIC) architecture. ...

Cluster-level tuning of a shallow water equation solver on the Intel MIC architecture

The paper demonstrates the optimization of the execution environment of a hybrid OpenMP+MPI computational fluid dynamics code (shallow water equation solver) on a cluster enabled with Intel Xeon Phi coprocessors. The discussion includes: 1. Controlling the number and affinity of OpenMP threads to optimize access to memory bandwidth; 2. Tuning the inter-operation of OpenMP and MPI to partition t...

Efficient Hybrid Execution of C++ Applications using Intel(R) Xeon Phi(TM) Coprocessor

The introduction of Intel® Xeon Phi™ coprocessors opened up new possibilities in the development of highly parallel applications. The familiarity and flexibility of the architecture, together with compiler support integrated into the Intel C++ Composer XE, allow developers to use familiar programming paradigms and techniques, which are usually not suitable for other accelerated systems. It ...

SIMD Implementation of a Multiplicative Schwarz Smoother for a Multigrid Poisson Solver on an Intel Xeon Phi Coprocessor

In this paper, we discuss an efficient implementation of the three-dimensional multigrid Poisson solver on a many-core coprocessor, the Intel Xeon Phi. We have used the modified block red-black (mBRB) Gauss-Seidel (GS) smoother to achieve a sufficient degree of parallelism and a high cache hit ratio. We have vectorized (SIMDized) the GS steps in the smoother by introducing a partially SIMDizing techniq...

Phi™ for DNA Sequence Analysis

Genetic information is increasing exponentially, doubling every 18 months. Analyzing this information within a reasonable amount of time requires parallel computing resources. While considerable research has addressed DNA analysis using GPUs, so far not much attention has been paid to the Intel Xeon Phi coprocessor. In this paper we present an algorithm for large-scale DNA analysis that exploit...

Publication date: 2015